309 research outputs found

    Convex relaxation methods for graphical models : Lagrangian and maximum entropy approaches

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.Includes bibliographical references (p. 241-257).Graphical models provide compact representations of complex probability distributions of many random variables through a collection of potential functions defined on small subsets of these variables. This representation is defined with respect to a graph in which nodes represent random variables and edges represent the interactions among those random variables. Graphical models provide a powerful and flexible approach to many problems in science and engineering, but also present serious challenges owing to the intractability of optimal inference and estimation over general graphs. In this thesis, we consider convex optimization methods to address two central problems that commonly arise for graphical models. First, we consider the problem of determining the most probable configuration-also known as the maximum a posteriori (MAP) estimate-of all variables in a graphical model, conditioned on (possibly noisy) measurements of some variables. This general problem is intractable, so we consider a Lagrangian relaxation (LR) approach to obtain a tractable dual problem. This involves using the Lagrangian decomposition technique to break up an intractable graph into tractable subgraphs, such as small "blocks" of nodes, embedded trees or thin subgraphs. We develop a distributed, iterative algorithm that minimizes the Lagrangian dual function by block coordinate descent. This results in an iterative marginal-matching procedure that enforces consistency among the subgraphs using an adaptation of the well-known iterative scaling algorithm. This approach is developed both for discrete variable and Gaussian graphical models. In discrete models, we also introduce a deterministic annealing procedure, which introduces a temperature parameter to define a smoothed dual function and then gradually reduces the temperature to recover the (non-differentiable) Lagrangian dual. When strong duality holds, we recover the optimal MAP estimate. We show that this occurs for a broad class of "convex decomposable" Gaussian graphical models, which generalizes the "pairwise normalizable" condition known to be important for iterative estimation in Gaussian models.(cont.) In certain "frustrated" discrete models a duality gap can occur using simple versions of our approach. We consider methods that adaptively enhance the dual formulation, by including more complex subgraphs, so as to reduce the duality gap. In many cases we are able to eliminate the duality gap and obtain the optimal MAP estimate in a tractable manner. We also propose a heuristic method to obtain approximate solutions in cases where there is a duality gap. Second, we consider the problem of learning a graphical model (both the graph and its potential functions) from sample data. We propose the maximum entropy relaxation (MER) method, which is the convex optimization problem of selecting the least informative (maximum entropy) model over an exponential family of graphical models subject to constraints that small subsets of variables should have marginal distributions that are close to the distribution of sample data. We use relative entropy to measure the divergence between marginal probability distributions. We find that MER leads naturally to selection of sparse graphical models. To identify this sparse graph efficiently, we use a "bootstrap" method that constructs the MER solution by solving a sequence of tractable subproblems defined over thin graphs, including new edges at each step to correct for large marginal divergences that violate the MER constraint. The MER problem on each of these subgraphs is efficiently solved using the primaldual interior point method (implemented so as to take advantage of efficient inference methods for thin graphical models). We also consider a dual formulation of MER that minimizes a convex function of the potentials of the graphical model. This MER dual problem can be interpreted as a robust version of maximum-likelihood parameter estimation, where the MER constraints specify the uncertainty in the sufficient statistics of the model. This also corresponds to a regularized maximum-likelihood approach, in which an information-geometric regularization term favors selection of sparse potential representations. We develop a relaxed version of the iterative scaling method to solve this MER dual problem.by Jason K. Johnson.Ph.D

    The state of the Martian climate

    Get PDF
    60°N was +2.0°C, relative to the 1981–2010 average value (Fig. 5.1). This marks a new high for the record. The average annual surface air temperature (SAT) anomaly for 2016 for land stations north of starting in 1900, and is a significant increase over the previous highest value of +1.2°C, which was observed in 2007, 2011, and 2015. Average global annual temperatures also showed record values in 2015 and 2016. Currently, the Arctic is warming at more than twice the rate of lower latitudes

    US Cosmic Visions: New Ideas in Dark Matter 2017: Community Report

    Get PDF
    This white paper summarizes the workshop "U.S. Cosmic Visions: New Ideas in Dark Matter" held at University of Maryland on March 23-25, 2017.Comment: 102 pages + reference

    The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons

    Get PDF
    To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before teleost genome duplication (TGD). The slowly evolving gar genome has conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization and development (mediated, for example, by Hox, ParaHox and microRNA genes). Numerous conserved noncoding elements (CNEs; often cis regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles for such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses showed that the sums of expression domains and expression levels for duplicated teleost genes often approximate the patterns and levels of expression for gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes and the function of human regulatory sequences

    Genetic fine mapping and genomic annotation defines causal mechanisms at type 2 diabetes susceptibility loci.

    Get PDF
    We performed fine mapping of 39 established type 2 diabetes (T2D) loci in 27,206 cases and 57,574 controls of European ancestry. We identified 49 distinct association signals at these loci, including five mapping in or near KCNQ1. 'Credible sets' of the variants most likely to drive each distinct signal mapped predominantly to noncoding sequence, implying that association with T2D is mediated through gene regulation. Credible set variants were enriched for overlap with FOXA2 chromatin immunoprecipitation binding sites in human islet and liver cells, including at MTNR1B, where fine mapping implicated rs10830963 as driving T2D association. We confirmed that the T2D risk allele for this SNP increases FOXA2-bound enhancer activity in islet- and liver-derived cells. We observed allele-specific differences in NEUROD1 binding in islet-derived cells, consistent with evidence that the T2D risk allele increases islet MTNR1B expression. Our study demonstrates how integration of genetic and genomic information can define molecular mechanisms through which variants underlying association signals exert their effects on disease

    A many-analysts approach to the relation between religiosity and well-being

    Get PDF
    The relation between religiosity and well-being is one of the most researched topics in the psychology of religion, yet the directionality and robustness of the effect remains debated. Here, we adopted a many-analysts approach to assess the robustness of this relation based on a new cross-cultural dataset (N=10,535 participants from 24 countries). We recruited 120 analysis teams to investigate (1) whether religious people self-report higher well-being, and (2) whether the relation between religiosity and self-reported well-being depends on perceived cultural norms of religion (i.e., whether it is considered normal and desirable to be religious in a given country). In a two-stage procedure, the teams first created an analysis plan and then executed their planned analysis on the data. For the first research question, all but 3 teams reported positive effect sizes with credible/confidence intervals excluding zero (median reported β=0.120). For the second research question, this was the case for 65% of the teams (median reported β=0.039). While most teams applied (multilevel) linear regression models, there was considerable variability in the choice of items used to construct the independent variables, the dependent variable, and the included covariates

    A Many-analysts Approach to the Relation Between Religiosity and Well-being

    Get PDF
    The relation between religiosity and well-being is one of the most researched topics in the psychology of religion, yet the directionality and robustness of the effect remains debated. Here, we adopted a many-analysts approach to assess the robustness of this relation based on a new cross-cultural dataset (N = 10, 535 participants from 24 countries). We recruited 120 analysis teams to investigate (1) whether religious people self-report higher well-being, and (2) whether the relation between religiosity and self-reported well-being depends on perceived cultural norms of religion (i.e., whether it is considered normal and desirable to be religious in a given country). In a two-stage procedure, the teams first created an analysis plan and then executed their planned analysis on the data. For the first research question, all but 3 teams reported positive effect sizes with credible/confidence intervals excluding zero (median reported β = 0.120). For the second research question, this was the case for 65% of the teams (median reported β = 0.039). While most teams applied (multilevel) linear regression models, there was considerable variability in the choice of items used to construct the independent variables, the dependent variable, and the included covariates

    Sloan Digital Sky Survey IV: mapping the Milky Way, nearby galaxies, and the distant universe

    Get PDF
    We describe the Sloan Digital Sky Survey IV (SDSS-IV), a project encompassing three major spectroscopic programs. The Apache Point Observatory Galactic Evolution Experiment 2 (APOGEE-2) is observing hundreds of thousands of Milky Way stars at high resolution and high signal-to-noise ratios in the near-infrared. The Mapping Nearby Galaxies at Apache Point Observatory (MaNGA) survey is obtaining spatially resolved spectroscopy for thousands of nearby galaxies (median ). The extended Baryon Oscillation Spectroscopic Survey (eBOSS) is mapping the galaxy, quasar, and neutral gas distributions between and 3.5 to constrain cosmology using baryon acoustic oscillations, redshift space distortions, and the shape of the power spectrum. Within eBOSS, we are conducting two major subprograms: the SPectroscopic IDentification of eROSITA Sources (SPIDERS), investigating X-ray AGNs and galaxies in X-ray clusters, and the Time Domain Spectroscopic Survey (TDSS), obtaining spectra of variable sources. All programs use the 2.5 m Sloan Foundation Telescope at the Apache Point Observatory; observations there began in Summer 2014. APOGEE-2 also operates a second near-infrared spectrograph at the 2.5 m du Pont Telescope at Las Campanas Observatory, with observations beginning in early 2017. Observations at both facilities are scheduled to continue through 2020. In keeping with previous SDSS policy, SDSS-IV provides regularly scheduled public data releases; the first one, Data Release 13, was made available in 2016 July

    Sloan Digital Sky Survey IV: Mapping the Milky Way, Nearby Galaxies, and the Distant Universe

    Get PDF
    We describe the Sloan Digital Sky Survey IV (SDSS-IV), a project encompassing three major spectroscopic programs. The Apache Point Observatory Galactic Evolution Experiment 2 (APOGEE-2) is observing hundreds of thousands of Milky Way stars at high resolution and high signal-to-noise ratios in the near-infrared. The Mapping Nearby Galaxies at Apache Point Observatory (MaNGA) survey is obtaining spatially resolved spectroscopy for thousands of nearby galaxies (median z0.03z\sim 0.03). The extended Baryon Oscillation Spectroscopic Survey (eBOSS) is mapping the galaxy, quasar, and neutral gas distributions between z0.6z\sim 0.6 and 3.5 to constrain cosmology using baryon acoustic oscillations, redshift space distortions, and the shape of the power spectrum. Within eBOSS, we are conducting two major subprograms: the SPectroscopic IDentification of eROSITA Sources (SPIDERS), investigating X-ray AGNs and galaxies in X-ray clusters, and the Time Domain Spectroscopic Survey (TDSS), obtaining spectra of variable sources. All programs use the 2.5 m Sloan Foundation Telescope at the Apache Point Observatory; observations there began in Summer 2014. APOGEE-2 also operates a second near-infrared spectrograph at the 2.5 m du Pont Telescope at Las Campanas Observatory, with observations beginning in early 2017. Observations at both facilities are scheduled to continue through 2020. In keeping with previous SDSS policy, SDSS-IV provides regularly scheduled public data releases; the first one, Data Release 13, was made available in 2016 July
    corecore